ABSTRACT

Feature-based methods have recently been considered in the literature for detection of stationary human targets in
through-the-wall radar imagery. Specifically, textural features, such as contrast, correlation, energy, entropy, and
homogeneity, have been extracted from gray-level co-occurrence matrices (GLCMs) to aid in discriminating the true
targets from multipath ghosts and clutter that closely mimic the target in size and intensity. In this paper, we address the
task of feature selection to identify the relevant subset of features in the GLCM domain, while discarding those that are
either redundant or confusing, thereby improving the performance of the feature-based scheme in distinguishing between
targets and ghosts/clutter. We apply a Decision Tree algorithm to find the optimal combination of co-occurrence based
textural features for the problem at hand. We employ a K-Nearest Neighbor classifier to evaluate the performance of the
optimal textural feature based scheme in terms of its target and ghost/clutter discrimination capability, using real data
collected with the vehicle-borne multi-channel through-the-wall radar imaging system developed by Defence Research and
Development Canada. For the specific data analyzed, it is shown that the identified dominant features yield a higher
classification accuracy, with a lower number of false alarms and missed detections, compared to the full GLCM based
feature set.

Keywords:  Through-the-wall  radar  imaging,  feature  selection,  target  detection,  co-occurrence  matrix. 


1.  INTRODUCTION 

Through-the-wall radar imaging (TWRI) covers a broad range of applications in both civilian and military contexts,
ranging from surveillance and reconnaissance to hostage rescue missions and searching for survivors in natural disasters.
One of the primary objectives of TWRI is to provide means for detection of stationary humans obscured by walls [1,2]. This
highly desirable objective is challenged by the presence of strong clutter caused by electromagnetic (EM) scattering
from the building structure and other stationary indoor objects, and also by the rich multipath returns resulting from
target interactions with the indoor environment. The losses encountered by the signal due to the presence of exterior and
interior walls between the radar and the targets limit the use of biometric features, such as breathing and heartbeat, for
identifying stationary humans inside buildings. As such, despite the presence of clutter and multipath ghosts in radar
images, most of the efforts related to stationary indoor target detection have been focused solely on the development of
effective techniques in the image domain [3-7].

Feature-based methods have shown promise in discriminating the true targets from multipath ghosts and clutter that
closely mimic the targets in size and intensity in through-the-wall radar imagery [7,8]. More specifically, target and clutter
discriminating characteristics in synthetic aperture radar (SAR) based indoor images have been captured through textural
feature extraction from the gray level co-occurrence matrices (GLCMs). GLCMs encapsulate the local spatial
relationships among the gray levels of neighboring image pixels and have found widespread application in optical and
medical image analyses [9-11]. Five commonly used co-occurrence based textural features, namely, contrast, correlation,
energy, entropy, and homogeneity, are obtained from known target and ghost/clutter regions in through-the-wall radar
images and are used to train a minimum distance classifier [7].

Figure 1. Example of a 3D SAR image, based on real data experiments conducted by DRDC.


In this paper, we address the task of feature selection in order to identify a relevant subset of the aforementioned five
co-occurrence features, while discarding those that are either redundant or confusing, thereby improving the performance of
the feature based detection technique of Ref. [7]. A Decision Tree algorithm [12] is applied to find the optimal combination
of features for the classification problem at hand. A K-Nearest Neighbor (K-NN) classifier [13,14] is employed to
evaluate the identified dominant features in terms of their target and ghost discrimination capability
and to compare them with the full feature set. To this end, we use real three-dimensional (3D) images acquired with the
vehicle-borne multi-channel through-the-wall radar imaging system developed by Defence Research and Development
Canada (DRDC) [15]. The specific dataset corresponds to through-the-wall measurements of a small room with six human
occupants, with one person sitting on the floor while the others stand at various locations. We show that, for the
specific data analyzed, energy and entropy are identified as the dominant textural features and provide superior
classification performance over the full feature set.

The remainder of the paper is organized as follows. Section 2 reviews the co-occurrence feature based detection
technique, highlighting the five considered textural features. Feature selection based on decision trees is presented in
Section 3. A performance comparison of the identified dominant features and the full feature set using real images is
provided in Section 4. Section 5 provides the conclusion.


2.  CO-OCCURRENCE  FEATURE  BASED  DETECTION  TECHNIQUE 

In  this  section,  we  review  the  co-occurrence  feature  based  image  domain  detection  scheme  proposed  in  Ref.  [7]  for 
target  and  clutter/ghost  discrimination  in  TWRI. 

Consider a 3D image of size $N \times M \times L$, whose pixels can assume an intensity value from the set $\{0, 1, \ldots, J-1\}$, where $J$
denotes the total number of intensity levels. Figure 1 shows an example of such a 3D SAR through-the-wall image.

2.1  Gray  Level  Co-occurrence  Matrix 

A co-occurrence matrix is defined as a two-dimensional (2D) histogram of gray levels for a pair of pixels, which are
separated by a fixed spatial relationship, specified in terms of distance and direction. With $J$ as the total number of
intensity levels in the image under consideration, the $(p, q)$-th element of a $J \times J$ GLCM $G_d$, corresponding to
displacement $d = (d_x, d_y, d_z)$, is the relative frequency with which two neighboring pixels displaced by $d$ occur in the
image, one with gray level $p$, and the other with gray level $q$. Formally, the $(p, q)$-th element of $G_d$ reads as [16]

$$G_d(p,q) = \sum_{n}\sum_{m}\sum_{l} \begin{cases} 1, & \text{if } I(n,m,l) = p \text{ and } I(n+d_x, m+d_y, l+d_z) = q \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

where $p, q = 0, 1, \ldots, J-1$, and $I(n,m,l)$ and $I(n+d_x, m+d_y, l+d_z)$ are the intensity values of the two pixels with the
spatial relationship $d$. Typically used values for the displacement $d$ comprise an offset of one to two pixels in thirteen
possible directions represented by azimuth $\phi$ and elevation $\theta$, each ranging from 0° to 135° in 45° increments [16,17].
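
As a rough illustration of how such a set of displacements might be enumerated, the sketch below lists the 13 unique directions of a 3D neighborhood (one from each pair of opposite neighbors) and scales each by an offset of one or two pixels. The function name `displacement_vectors` and the integer-offset representation of the directions are assumptions made for illustration; they are not taken from Refs. [16,17].

```python
from itertools import product


def displacement_vectors(offsets=(1, 2)):
    """Return integer displacements d = (dx, dy, dz) for the given pixel offsets."""
    directions = []
    for d in product((-1, 0, 1), repeat=3):
        if d == (0, 0, 0):
            continue
        # d and -d describe the same direction, so keep only the one whose
        # first nonzero component is positive.
        first_nonzero = next(c for c in d if c != 0)
        if first_nonzero > 0:
            directions.append(d)
    # Scale every unit direction by each requested offset.
    return [tuple(k * c for c in d) for k in offsets for d in directions]


vectors = displacement_vectors()
print(len(vectors))  # 26
```

With the default offsets of one and two pixels, the 13 directions give 26 displacement vectors, one GLCM per vector.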

After constructing the GLCM for a given $d$, we normalize the GLCM so that the sum of its elements is equal to 1. That
is, the $(p, q)$-th element of the normalized GLCM is given by

$$\bar{G}_d(p,q) = \frac{G_d(p,q)}{\sum_{p=0}^{J-1}\sum_{q=0}^{J-1} G_d(p,q)} \tag{2}$$

Then, $\bar{G}_d(p,q)$ is the joint probability of occurrence of pixel pairs with a defined spatial relationship $d$ having gray level
values $p$ and $q$ in the 3D image.
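
The construction and normalization of a GLCM for one displacement can be sketched as follows. This is a minimal illustration rather than the implementation used in Ref. [7]: the nested-list image layout `image[n][m][l]`, the function names, and the choice to skip pixel pairs that fall outside the image are all assumptions.

```python
def glcm_3d(image, d, levels):
    """Count co-occurrences of gray levels (p, q) for pixel pairs displaced by d."""
    dx, dy, dz = d
    N, M, L = len(image), len(image[0]), len(image[0][0])
    G = [[0] * levels for _ in range(levels)]
    for n in range(N):
        for m in range(M):
            for l in range(L):
                n2, m2, l2 = n + dx, m + dy, l + dz
                # Skip pairs whose second pixel falls outside the image.
                if 0 <= n2 < N and 0 <= m2 < M and 0 <= l2 < L:
                    G[image[n][m][l]][image[n2][m2][l2]] += 1
    return G


def normalize(G):
    """Scale the GLCM so that its elements sum to 1."""
    total = sum(sum(row) for row in G)
    if total == 0:
        raise ValueError("no pixel pairs exist for this displacement")
    return [[g / total for g in row] for row in G]


# Tiny 2 x 2 x 2 example with J = 2 gray levels.
image = [[[0, 1], [1, 1]],
         [[0, 0], [1, 0]]]
G = glcm_3d(image, d=(0, 0, 1), levels=2)
G_bar = normalize(G)
```

In this toy image each of the four gray level pairs occurs once for the displacement (0, 0, 1), so every element of the normalized matrix equals 0.25.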

2.2  Feature  Extraction 

Five different features, namely, contrast, correlation, energy, entropy, and homogeneity, are extracted from each
normalized GLCM $\bar{G}_d$ [7,9,17]. Contrast measures the amount of local intensity variations present in the image and is defined
as

$$\text{Contrast} = \sum_{p,q} |p-q|^2 \, \bar{G}_d(p,q) \tag{3}$$

The  correlation  feature  is  a  measure  of  gray  level  linear  dependencies  in  the  image  and  is  given  by 

$$\text{Correlation} = \sum_{p,q} \frac{(p-\mu_1)(q-\mu_2)\,\bar{G}_d(p,q)}{\sigma_1 \sigma_2} \tag{4}$$

where $\mu_1, \sigma_1$ and $\mu_2, \sigma_2$ are the means and standard deviations of the respective marginal distributions
$\bar{G}_d(p) = \sum_q \bar{G}_d(p,q)$ and $\bar{G}_d(q) = \sum_p \bar{G}_d(p,q)$ associated with the normalized GLCM. The energy feature measures the
textural uniformity and is defined as

$$\text{Energy} = \sum_{p,q} \left(\bar{G}_d(p,q)\right)^2 \tag{5}$$

The entropy feature measures the disorder or complexity of the image and is defined as

$$\text{Entropy} = -\sum_{p,q} \bar{G}_d(p,q) \log \bar{G}_d(p,q) \tag{6}$$

Finally, the homogeneity feature measures the closeness of the distribution of elements in the co-occurrence matrix to the
co-occurrence matrix diagonal. It is defined as

$$\text{Homogeneity} = \sum_{p,q} \frac{1}{1+|p-q|} \, \bar{G}_d(p,q) \tag{7}$$

For the through-the-wall stationary human detection problem, the aforementioned textural features are extracted from the
GLCMs for the twenty-six displacement vectors (thirteen directions at offsets of one and two pixels). Therefore, the
length of the resulting feature vector is 5 × 26 = 130. It is noted that, instead of the entire image, the textural feature
vectors are computed for known target and ghost/clutter regions for the training set and for those regions of the test 3D
image that are identified as candidate target regions.
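
The five features can be computed from a normalized GLCM along the following lines. This sketch assumes the commonly used forms of the definitions: squared gray level difference for contrast, natural logarithm and a leading minus sign for entropy, and a weight of 1/(1 + |p - q|) for homogeneity. The function name `glcm_features` is hypothetical.

```python
import math


def glcm_features(G_bar):
    """Return (contrast, correlation, energy, entropy, homogeneity)."""
    idx = range(len(G_bar))
    # Marginal distributions with their means and standard deviations.
    row = [sum(G_bar[p][q] for q in idx) for p in idx]
    col = [sum(G_bar[p][q] for p in idx) for q in idx]
    mu1 = sum(p * row[p] for p in idx)
    mu2 = sum(q * col[q] for q in idx)
    s1 = math.sqrt(sum((p - mu1) ** 2 * row[p] for p in idx))
    s2 = math.sqrt(sum((q - mu2) ** 2 * col[q] for q in idx))

    contrast = sum((p - q) ** 2 * G_bar[p][q] for p in idx for q in idx)
    # Correlation is undefined when a marginal has zero variance.
    if s1 > 0 and s2 > 0:
        correlation = sum((p - mu1) * (q - mu2) * G_bar[p][q]
                          for p in idx for q in idx) / (s1 * s2)
    else:
        correlation = float("nan")
    energy = sum(g ** 2 for r in G_bar for g in r)
    # Zero entries are skipped, i.e. 0 log 0 is taken as 0.
    entropy = -sum(g * math.log(g) for r in G_bar for g in r if g > 0)
    homogeneity = sum(G_bar[p][q] / (1 + abs(p - q))
                      for p in idx for q in idx)
    return contrast, correlation, energy, entropy, homogeneity


uniform = glcm_features([[0.25, 0.25], [0.25, 0.25]])
diagonal = glcm_features([[0.5, 0.0], [0.0, 0.5]])
```

A purely diagonal matrix gives zero contrast and unit homogeneity, whereas the uniform matrix gives the lowest energy and the highest entropy for its size, which matches the interpretation of the features given above.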


2.3  K-NN  Classifier 


We employ a simple supervised learning algorithm, namely, the K-Nearest Neighbor (K-NN) classifier, which is commonly
used in learning and classification. Therein, an object is classified based on the "distance" of its features from those of its
neighbors, with the object being assigned to the class most common among its K nearest neighbors [14]. The Euclidean distance
is the commonly used distance metric. The neighbors are taken from a set of objects, called the training set, for which
the correct classification is known.

•  If K = 1, the algorithm reduces to the nearest neighbor algorithm and the object is assigned to the class of its
nearest neighbor.

•  If K > 1, the object is assigned to the class of the majority of its K nearest neighbors.

Typically, K is chosen to be odd when the number of classes is 2 so that ties cannot occur. A higher K can increase the
classification accuracy by reducing the sensitivity to noisy training samples, but at the expense of computational time.
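
The classification rule described above can be sketched in a few lines. This is a minimal illustration with the Euclidean distance and majority voting; the function name `knn_classify` and the toy training data are assumptions, not part of the original experiments.

```python
import math
from collections import Counter


def knn_classify(x, train_X, train_y, k=3):
    """Assign x to the majority class among its k nearest training samples."""
    # Sort training indices by Euclidean distance to x.
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(x, train_X[i]))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Made-up two-dimensional feature vectors for illustration only.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["target", "target", "target", "clutter", "clutter", "clutter"]
label = knn_classify((0.5, 0.5), train_X, train_y, k=3)
```

With two classes and an odd K, the vote count can never be tied, which is why the sketch does not include a tie-breaking rule.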

2.4  Performance  Metrics 

We consider three metrics, namely, accuracy, missed detection rate, and false alarm rate, to provide a quantitative
assessment of the classification technique. These metrics are defined as follows.

$$\text{Accuracy} = \frac{\text{Number of objects correctly classified}}{\text{Total number of objects}} \tag{8}$$

$$\text{Missed Detections} = \frac{\text{Number of targets incorrectly classified}}{\text{Total number of objects}} \tag{9}$$

$$\text{False Alarms} = \frac{\text{Number of ghosts/clutter incorrectly classified}}{\text{Total number of objects}} \tag{10}$$
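
The three metrics can be computed from the true and predicted labels as sketched below. The function name `performance_metrics` and the label strings are hypothetical. Note that all three rates share the total number of objects as the denominator, so they sum to one in the two-class case.

```python
def performance_metrics(true_labels, predicted_labels, target="target"):
    """Return (accuracy, missed detection rate, false alarm rate)."""
    total = len(true_labels)
    pairs = list(zip(true_labels, predicted_labels))
    correct = sum(t == p for t, p in pairs)
    # A missed detection is a true target assigned to the other class.
    missed = sum(t == target and p != target for t, p in pairs)
    # A false alarm is a ghost/clutter region declared to be a target.
    false_alarms = sum(t != target and p == target for t, p in pairs)
    return correct / total, missed / total, false_alarms / total


# Made-up labels for illustration only.
truth = ["target", "target", "clutter", "clutter"]
predicted = ["target", "clutter", "target", "clutter"]
accuracy, missed_rate, false_alarm_rate = performance_metrics(truth, predicted)
```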


3.  FEATURE  SELECTION 

We consider decision tree analysis to identify a subset of the considered features that is most relevant for distinguishing
targets from clutter/ghost regions in the 3D images. The decision tree based scheme is a nonparametric approach, which does
not require any prior assumptions about the probability distributions of the various features [20]. A decision tree is a
hierarchical structure, which consists of directed edges and three types of nodes: i) a root node that has no incoming
edges and zero or more outgoing edges, ii) internal nodes, each of which has exactly one incoming edge and two or more
outgoing edges, and iii) leaf or terminal nodes, each of which has exactly one incoming edge and no outgoing edges.
The tree is typically grown as a recursive partitioning of the training samples into successively purer subsets. If all of the
training samples associated with a particular node t belong to the same class, then t is a leaf node and gets assigned a
class label. On the other hand, if the training samples associated with node t belong to different classes, then a single
feature test condition is chosen to separate the sample points into smaller subsets. A child node is created for each
outcome of the test condition and the records associated with the parent node t are distributed to the children based on
the outcomes. The algorithm is then recursively applied to each child node until a stopping criterion is met.
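
The recursive partitioning can be sketched as follows for binary threshold tests. The rule that picks the test condition is passed in as `choose_split`. The placeholder `midrange_split` used here is purely hypothetical and only serves to make the example runnable. The nested-dictionary tree layout and the depth limit used as the stopping criterion are likewise assumptions.

```python
def grow_tree(samples, labels, choose_split, depth=0, max_depth=10):
    """Recursively partition the training samples into a nested-dict tree."""
    majority = max(set(labels), key=labels.count)
    # Leaf node: a single class remains or the stopping criterion is met.
    if len(set(labels)) == 1 or depth == max_depth:
        return {"label": majority}
    feature, threshold = choose_split(samples, labels)
    left = [i for i, s in enumerate(samples) if s[feature] < threshold]
    right = [i for i, s in enumerate(samples) if s[feature] >= threshold]
    if not left or not right:  # the test condition separated nothing
        return {"label": majority}
    return {
        "feature": feature,
        "threshold": threshold,
        "left": grow_tree([samples[i] for i in left],
                          [labels[i] for i in left],
                          choose_split, depth + 1, max_depth),
        "right": grow_tree([samples[i] for i in right],
                           [labels[i] for i in right],
                           choose_split, depth + 1, max_depth),
    }


def predict(tree, x):
    """Follow the test conditions from the root down to a leaf."""
    while "label" not in tree:
        branch = "left" if x[tree["feature"]] < tree["threshold"] else "right"
        tree = tree[branch]
    return tree["label"]


def midrange_split(samples, labels):
    """Hypothetical placeholder: split feature 0 at the middle of its range."""
    values = [s[0] for s in samples]
    return 0, (min(values) + max(values)) / 2


samples = [(0.1,), (0.2,), (0.8,), (0.9,)]
labels = [0, 0, 1, 1]
tree = grow_tree(samples, labels, midrange_split)
```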

For the target and clutter/multipath discrimination problem, there are only two classes and the considered features take
continuous values. As such, the test condition at the root and internal nodes takes the form of a comparison test with
binary outcomes. That is, the feature value is compared to a threshold and the training samples are split accordingly.
The design issues that need to be addressed are i) the determination of appropriate thresholds for the various features,
and ii) the selection of the best feature to use at a particular node for making the split. For the latter, a goodness criterion
is used to determine how well the various feature test conditions perform. A typical strategy is to select the feature test
condition that minimizes the weighted average $A$ of an impurity measure $H(\cdot)$ of the child nodes, given by

$$A = \sum_{i} \frac{Q_i}{Q} \, H(v_i) \tag{11}$$

Figure 2. (a) Through-the-wall MIMO System, (b) Building used for Through-the-Wall Measurements (the dashed square
indicates the room containing the human targets), (c) Scene with six human subjects. Photos by J. Lang, DRDC Ottawa.


where $Q$ is the total number of training samples at the parent node $t$ and $Q_i$ is the number of training samples associated
with the child node $v_i$. The Gini index is a commonly used impurity measure, which is defined for the underlying
two-class problem as

$$H(v_i) = 1 - \sum_{j=0}^{1} \left[p(j \mid v_i)\right]^2, \qquad p(1 \mid v_i) = 1 - p(0 \mid v_i) \tag{12}$$

with $p(j \mid v_i)$ being the fraction of samples belonging to class $j$ at a given node $v_i$.

For each feature X, the threshold for the comparison test can also be determined by using the weighted average of the
Gini index. The training samples are first sorted based on the values they take for the feature X, and candidate thresholds
are identified by taking the midpoints between two adjacent sorted values. For each candidate threshold, the data set is
scanned to count the number of training samples less than or greater than the candidate. The Gini index values for the
corresponding child nodes and their weighted average are then computed. The candidate that produces the lowest
weighted average of the Gini index is chosen as the threshold for feature X.
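
The threshold search and the feature selection at a node can be sketched as follows. This is a simple quadratic-time illustration of the procedure described above, not an optimized implementation. The function names and the toy data are assumptions.

```python
def gini(labels):
    """Gini index of a node holding the given class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def weighted_gini(children):
    """Weighted average of the Gini index over the child nodes."""
    total = sum(len(child) for child in children)
    return sum(len(child) / total * gini(child) for child in children)


def best_threshold(values, labels):
    """Return (threshold, impurity) with the lowest weighted Gini index."""
    pairs = sorted(zip(values, labels), key=lambda s: s[0])
    best = (None, float("inf"))
    for (a, _), (b, _) in zip(pairs, pairs[1:]):
        if a == b:
            continue  # identical values give no midpoint to split on
        t = (a + b) / 2  # candidate: midpoint of adjacent sorted values
        left = [lab for v, lab in pairs if v < t]
        right = [lab for v, lab in pairs if v >= t]
        score = weighted_gini([left, right])
        if score < best[1]:
            best = (t, score)
    return best


def best_feature(samples, labels):
    """Pick the feature whose best split gives the lowest weighted Gini index."""
    n_features = len(samples[0])
    results = [best_threshold([s[f] for s in samples], labels)
               for f in range(n_features)]
    f = min(range(n_features), key=lambda i: results[i][1])
    return f, results[f][0]


# Made-up samples: feature 0 separates the classes, feature 1 does not.
samples = [(0.1, 5.0), (0.2, 1.0), (0.8, 4.0), (0.9, 2.0)]
labels = [0, 0, 1, 1]
feature, threshold = best_feature(samples, labels)
```

In this toy example the first feature separates the two classes perfectly at a threshold of 0.5, so its weighted Gini index is zero and it is selected for the split.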


4.  EXPERIMENTAL  RESULTS 

We use real 3D images collected with DRDC's through-the-wall multi-channel radar system [15]. The radar is installed
inside a vehicle with its two transmit antennas and an eight-element receive array mounted on the side of the vehicle, as
shown in Fig. 2(a). The antenna elements are compact Y-shaped printed bowtie antennas and, when used in the vertical
polarization, have approximately 60° beamwidth in the elevation direction and 150° beamwidth in the azimuth or
horizontal direction [18]. The receive array has an inter-element spacing of 15 cm, and the two transmit antennas are
separated by 1.2 m. The transmit and receive array antennas have a horizontal spacing of 2 m. A frequency-modulated
continuous wave signal covering the 0.8 to 2.7 GHz frequency band is used as the transmit signal. A switch is used to
alternate the radar transmissions between the two transmit antennas, and the eight-channel radar receiver digitizes the
eight received signals for each radar transmission.

Figure 3. Decision tree grown using 5 attributes extracted from 26 GLCMs using 11 target and 11 clutter regions. The labels X1,
X2, X3, X4, and X5 denote energy, contrast, correlation, homogeneity, and entropy, respectively.


A  small  room  in  the  Troop  Shelter  building,  shown  in  Fig.  2(b),  was  imaged  three  different  times,  with  six,  four,  and  one 
human  occupant,  respectively.  The  antennas  were  lowered  on  the  van  between  measurements  from  the  first  scene  and  the 
latter  scenes,  which  resulted  in  a  considerable  increase  in  clutter.  Fig.  2(c)  depicts  the  scene  with  the  six  human  targets. 
The  exterior  walls  of  the  building  are  constructed  of  vinyl,  chip  board  and  dry  wall  on  a  16  in.  spacing  wood  stud  frame. 
The  raw  radar  data  were  collected  while  the  vehicle  moved  along  a  straight  path  parallel  to  the  front  wall  of  the  building, 
allowing 3D images to be generated in downrange, azimuth, and elevation using backprojection.

Eleven target regions and eleven clutter regions were extracted from the 3D through-the-wall images. GLCM
computations corresponding to the 26 displacements were carried out, followed by the extraction of the 130-element
textural feature vector, for each target and clutter region. Thus, we had a total of 11 target feature vectors and an equal
number of clutter feature vectors.

4.1  Decision  Tree  Analysis 

We  applied  decision  tree  analysis  to  determine  the  dominant  extracted  features,  which  would  provide  reliable 
discrimination  between  the  targets  and  clutter.  In  order  to  reduce  the  computational  complexity  and  improve  the  ease  of 
interpretation,  we  chose  not  to  identify  dominant  features  from  amongst  the  130  total  features.  Rather,  we  decided  to 
determine  the  dominant  attribute  from  amongst  contrast,  correlation,  energy,  entropy,  and  homogeneity.  Thus,  the  values 
taken  by  these  attributes  under  different  displacements  served  as  additional  training  sample  points  for  the  target  and 
clutter  classes.  Figure  3  shows  the  resulting  decision  tree.  We  observe  that  energy  and  entropy  play  a  dominant  role  as 
they  appear  towards  the  top  of  the  tree  structure,  whereas  correlation  comes  in  a  distant  third.  Homogeneity  and  contrast 
have  been  identified  as  irrelevant  attributes  for  the  classification  problem  at  hand  since  they  do  not  appear  in  the  tree. 
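
The pooling of attribute values across displacements can be sketched as follows. The sketch assumes that each 130-element feature vector stores the five attributes of one GLCM contiguously, which is an assumption about the data layout rather than something stated in the text. The function names and the placeholder data are hypothetical.

```python
def pool_displacements(feature_vector, n_attributes=5):
    """Split a flat feature vector into one n_attributes-long sample per GLCM."""
    if len(feature_vector) % n_attributes != 0:
        raise ValueError("length must be a multiple of n_attributes")
    return [list(feature_vector[i:i + n_attributes])
            for i in range(0, len(feature_vector), n_attributes)]


def pooled_training_set(region_vectors, region_labels, n_attributes=5):
    """Pool all regions so that every GLCM contributes one labeled sample."""
    samples, labels = [], []
    for vector, label in zip(region_vectors, region_labels):
        for sample in pool_displacements(vector, n_attributes):
            samples.append(sample)
            labels.append(label)
    return samples, labels


# Placeholder data: 22 regions with 130 made-up feature values each.
region_vectors = [[float(r + i) for i in range(130)] for r in range(22)]
region_labels = ["target"] * 11 + ["clutter"] * 11
samples, labels = pooled_training_set(region_vectors, region_labels)
print(len(samples), len(samples[0]))  # 572 5
```

With 22 regions and 26 GLCMs per region, this pooling yields 22 × 26 = 572 five-attribute training samples, so the tree is grown over five attributes instead of 130 features.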

4.2  Classification  Performance  Comparison 

We first performed classification using the full feature set, i.e., the feature vector of length 130. Because of the
limited data available (only 11 target and 11 clutter samples), we used leave-one-out cross validation [19], wherein
the classification was performed 22 times, each time using one feature vector from the dataset for testing and the
remaining 21 for training. In this way, all of the target and ghost/clutter regions in the dataset were used for both
training and testing. We used a value of K = 3 for the K-NN classifier. Table 1 (second column) provides the corresponding
values of the performance metrics. We note that the classification accuracy is 77.3%, with 4.5% false alarms and 18.2%
missed detections.

Table 1. Performance comparison between the dominant features and the full feature set.

Performance Metric       Full Feature Set    Energy & Entropy
Accuracy                 77.3%               90.9%
False Alarm Rate         4.5%                0%
Missed Detection Rate    18.2%               9.1%
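
The leave-one-out procedure with a K-NN classifier can be sketched as follows. This is a minimal illustration with the Euclidean distance. The function name `leave_one_out_predictions` and the toy data are assumptions and do not reproduce the radar dataset.

```python
import math
from collections import Counter


def leave_one_out_predictions(X, y, k=3):
    """Classify each sample with a K-NN rule trained on all other samples."""
    predictions = []
    for i in range(len(X)):
        # Hold out sample i and keep the remaining samples for training.
        train = [(X[j], y[j]) for j in range(len(X)) if j != i]
        train.sort(key=lambda s: math.dist(X[i], s[0]))
        votes = Counter(label for _, label in train[:k])
        predictions.append(votes.most_common(1)[0][0])
    return predictions


# Made-up, well separated two-dimensional samples for illustration only.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
y = ["target"] * 4 + ["clutter"] * 4
predictions = leave_one_out_predictions(X, y, k=3)
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
```

Each sample is tested exactly once and never appears in its own training set, so the number of classification runs equals the number of samples.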

Next, having identified the dominant features as energy and entropy, we proceeded with classification using only these
two textural features extracted from the 26 GLCMs. The new feature vector length is 2 × 26 = 52. The third column
of Table 1 provides the corresponding values of the performance metrics when a K-NN classifier with K = 3 and
cross-validation was used. We observe that, compared to the full feature set case, the classification accuracy has increased by
13.6 percentage points (from 77.3% to 90.9%), with no false alarms and a reduction of 9.1 percentage points in missed
detections (from 18.2% to 9.1%). This validates the improved performance of the
selected features in discriminating humans from ghosts/clutter.
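
As a quick consistency check, the percentages in Table 1 match the following counts out of the 22 leave-one-out trials. The counts themselves are inferred here from the reported percentages and are not stated explicitly in the text.

$$
\begin{aligned}
\text{Full feature set:}\quad & \tfrac{17}{22} \approx 77.3\%, \quad \tfrac{1}{22} \approx 4.5\%, \quad \tfrac{4}{22} \approx 18.2\% \\
\text{Energy and entropy:}\quad & \tfrac{20}{22} \approx 90.9\%, \quad \tfrac{0}{22} = 0\%, \quad \tfrac{2}{22} \approx 9.1\% \\
\text{Accuracy gain:}\quad & \tfrac{20 - 17}{22} = \tfrac{3}{22} \approx 13.6 \text{ percentage points}
\end{aligned}
$$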


5.  CONCLUSION 

In  this  paper,  we  presented  decision  tree  analysis  to  identify  the  dominant  and  most  discriminating  GLCM  based  textural 
features  for  improved  capability  to  distinguish  between  targets  and  ghosts/clutter  in  through-the-wall  radar  imaging 
applications.  For  the  data  analyzed,  the  energy  and  entropy  attributes  were  determined  to  be  the  most  relevant  amongst 
the  set  of  five  commonly  used  GLCM  features,  which  also  included  contrast,  correlation,  and  homogeneity.  The 
performance of the feature based scheme using the dominant attributes was evaluated with a K-Nearest Neighbor
classifier. It was shown that, compared to the scheme based on all five attributes, the dominant features yielded a higher
classification accuracy, with a lower number of false alarms and missed detections.